On the Underestimation of Model Uncertainty by Bayesian K-nearest Neighbors

Authors

  • Wanhua Su
  • Hugh Chipman
  • Mu Zhu
Abstract

When using the K-nearest neighbors method, one often ignores uncertainty in the choice of K. To account for such uncertainty, Holmes and Adams (2002) proposed a Bayesian framework for K-nearest neighbors (KNN). Their Bayesian KNN (BKNN) approach uses a pseudo-likelihood function and standard Markov chain Monte Carlo (MCMC) techniques to draw posterior samples. Holmes and Adams (2002) focused on the performance of BKNN in terms of misclassification error but did not assess its ability to quantify uncertainty. We present some evidence to show that BKNN still significantly underestimates model uncertainty.
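
As a rough illustration of the kind of procedure the abstract describes, the sketch below evaluates a KNN pseudo-likelihood and draws posterior samples of K and an interaction strength beta with a random-walk Metropolis sampler. The pseudo-likelihood follows the form usually attributed to Holmes and Adams (2002), where each label's probability is a softmax of beta times the fraction of its K nearest neighbors in each class. Everything else is an assumption made for illustration and is not taken from the paper: the flat priors, the bounds `k_max` and `beta_max`, the proposal step sizes, and all function names.

```python
import numpy as np


def neighbor_order(X):
    """For each point, indices of all other points sorted by Euclidean distance."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :-1]  # the point itself sorts last; drop it


def log_pseudo_likelihood(y, order, k, beta, n_classes):
    """Sum over i of log softmax_c(beta * (# of k nearest neighbors of i in class c) / k) at c = y_i."""
    nbr_labels = y[order[:, :k]]                          # shape (n, k)
    counts = np.stack(
        [(nbr_labels == c).sum(axis=1) for c in range(n_classes)], axis=1
    )                                                     # shape (n, n_classes)
    scores = beta * counts / k
    log_norm = np.log(np.exp(scores).sum(axis=1))
    return float((scores[np.arange(len(y)), y] - log_norm).sum())


def sample_posterior(X, y, n_iter=2000, k_max=20, beta_max=10.0, seed=0):
    """Random-walk Metropolis over (k, beta) under assumed flat priors."""
    rng = np.random.default_rng(seed)
    order = neighbor_order(X)
    n_classes = int(y.max()) + 1
    k_max = min(k_max, len(y) - 1)        # cannot use more neighbors than exist
    k, beta = 1, 1.0
    ll = log_pseudo_likelihood(y, order, k, beta, n_classes)
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # Symmetric proposals, so the acceptance ratio is the likelihood ratio.
        k_new = k + int(rng.choice([-1, 1]))
        beta_new = beta + rng.normal(0.0, 0.5)
        # Proposals outside the prior support are rejected outright.
        if 1 <= k_new <= k_max and 0.0 < beta_new < beta_max:
            ll_new = log_pseudo_likelihood(y, order, k_new, beta_new, n_classes)
            if np.log(rng.uniform()) < ll_new - ll:
                k, beta, ll = k_new, beta_new, ll_new
        draws[t] = (k, beta)
    return draws
```

The spread of the sampled K values (the first column of the returned array) is the quantity whose adequacy as a measure of model uncertainty the paper calls into question; this sketch only shows how such samples could be produced, not how the authors assessed them.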

Similar resources

Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours

When using the K-nearest neighbours (KNN) method, one often ignores the uncertainty in the choice of K. To account for such uncertainty, Bayesian KNN (BKNN) has been proposed and studied (Holmes and Adams 2002; Cucala et al. 2009). We present some evidence to show that the pseudo-likelihood approach for BKNN, even after being corrected by Cucala et al. (2009), still significantly underest...

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...

A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater

The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...

Efficient model selection for probabilistic K nearest neighbour classification

Probabilistic K-nearest neighbour (PKNN) classification has been introduced to improve the performance of the original K-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature vector. However, an issue common to both KNN and PKNN is to select the optimal number of neighbours, K. The contribution of this paper is to incorporate t...

Efficient Estimation of the number of neighbours in Probabilistic K Nearest Neighbour Classification

Probabilistic k-nearest neighbour (PKNN) classification has been introduced to improve the performance of original k-nearest neighbour (KNN) classification algorithm by explicitly modelling uncertainty in the classification of each feature vector. However, an issue common to both KNN and PKNN is to select the optimal number of neighbours, k. The contribution of this paper is to incorporate the ...

Publication date: 2008